NEWS | IN DEPTH 


COMPUTER SCIENCE 

Beyond the Turing Test

Concluding that there is no one test for machine intelligence, 
AI researchers develop a battery of research challenges 


By Jia You

As the movie The Imitation Game celebrates
British mathematician Alan
Turing’s contributions to the Allied
victory in World War II, the artificial
intelligence (AI) community is rethinking
another of his legacies: the
Turing Test.

In 1950, Turing laid out an appealingly
simple test for whether a
machine possesses human-level intelligence:
Will a person conversing with
it in text mistake it for another human
being? But more than 60 years later,
it’s time for new criteria, says computer
scientist Leora Morgenstern of
Leidos Inc. in Reston, Virginia. “We
now know a lot about AI and what’s
needed to make progress. It’s a big
leap from Turing’s time.”

At a 25 January workshop at the
29th Association for the Advancement
of Artificial Intelligence conference
in Austin, Morgenstern and other researchers
will discuss proposals for
a new Turing Championship. In contrast
with Turing’s single litmus test,
the proposed challenges acknowledge
that intelligence has multiple
dimensions, from language comprehension
to social awareness, that are
best tackled piece by piece.

Over the years, Turing’s original
idea has grown into a small
industry while drawing increasing criticism.
Competitions such as the long-running
Loebner Prize ask human
judges to text chat with either a person or
a computer program for less than 30 minutes
and then determine the converser’s
identity. In June, a computer program 
named Eugene Goostman, which adopts 
the persona of a 13-year-old Ukrainian boy, 
was declared to have passed a Turing Test 
organized by the University of Reading in 
the United Kingdom after fooling a third of 
the judges in 5-minute conversations. Yet 
researchers such as cognitive scientist Gary 
Marcus of New York University in New York 
City argue that such competitions put a 
premium on stock answers and other ruses. 
“It’s a parlor trick,” Marcus says. “There’s no 
sense in which that program is genuinely 
intelligent.” The new Turing Championship
would motivate researchers to develop machines
with a deeper understanding of the
world, argues Marcus, who is co-organizing
the workshop.

116 9 JANUARY 2015 • VOL 347 ISSUE 6218

One set of proposed challenges focuses
on common-sense reasoning, which remains
a tall order for machines yet is crucial
for comprehending language. Take the
sentence, “The trophy would not fit in the
brown suitcase because it was too big.” Deducing
that “it” refers to the trophy, not the
suitcase, requires general knowledge that is
second nature for a person but difficult to
program into a machine. Next fall, in what
could be the first of the new Turing challenges,
the industry-sponsored Winograd
Schema Challenge will test machines’ comprehension
of such grammatically ambiguous
sentences.

A second set of proposed challenges centers
on machine vision. With new machine-learning
techniques that train computers to
discern objects, researchers at places such
as Google and Facebook are developing algorithms
that can guide a self-driving car
or automatically identify any face in any
photograph. But AI researchers want machines
to understand and reason with what
they see, says computer scientist Fei-Fei Li
of Stanford University in Palo Alto, California.
The challenge Li will propose would ask
machines to tell stories from pictures—not 
only identifying an object such as a coffee 
mug, for example, but also noting that it 
sits half-empty on a table because someone 
drank from it. Such machines might one day 
interpret what she calls the “dark matter of 
the digital age”: images and videos, which 
today’s search engines and bots can hardly 
make sense of. 

For machines to truly assist people in their
daily lives, physical movement smoothly integrated
with language and perceptual skills
has to be part of the mix, says computer
scientist Charles Ortiz of the Nuance
Natural Language and AI Laboratory
in Sunnyvale, California. His proposed
challenge would ask both a machine
and a human to manipulate a robotic
arm in order to, say, play with a toy.
At the same time, they would carry on
a conversation about their actions. As
in Turing’s original test, a judge would
evaluate the “humanness” of the computer’s
performance.

Intelligence has one more dimension,
says computer scientist Barbara
Grosz of Harvard University:
teamwork. To effectively collaborate
with humans, machines will need to
understand their teammates’ preferences,
share information appropriately,
and handle uncertain environments.
Grosz’s challenge would pair computers
with people in group activities, such
as formulating health care plans, to 
test whether people overlook that their 
partners aren’t human. 

Many more research challenges will 
be debated at the workshop, aimed at 
capabilities from long-term learning 
to creativity. The goal, Marcus says, 
is to winnow the proposals down to 
three to five competitions. A balance of ambition
and realism is key, says computer scientist
Stuart Shieber of Harvard. “You want
to design competitions that are qualitatively
beyond the current level of AI, but not so far
that... it would be like setting an X prize for
space flight in da Vinci’s era,” he says.

Although it’s unlikely that consensus will
emerge in January, the discussion will continue
at another AI conference in July, says
co-organizer Manuela Veloso of Carnegie
Mellon University in Pittsburgh, Pennsylvania.
By early 2016, the organizers hope to
stage a set of trial competitions that will be
revised and repeated regularly. “If we don’t
move fast, it won’t happen,” Veloso says.
“People will lose momentum.” ■

sciencemag.org SCIENCE 



“We can only see a short distance 
ahead, but we can see plenty there 
that needs to be done.” 

Alan Turing, 1950 


Published by AAAS 

